Creating and Validating Multilingual Semantic Representations for Six Languages: Expert versus Non-Expert Crowds

Authors

  • Mahmoud El-Haj
  • Paul Rayson
  • Scott Piao
  • Stephen Wattam
Abstract

Creating high-quality, wide-coverage multilingual semantic lexicons to support knowledge-based approaches is a challenging, time-consuming manual task. It has traditionally been performed by linguistic experts, a slow and expensive process. We present an experiment in which we adapt and evaluate crowdsourcing methods employing native speakers to generate a list of coarse-grained senses under a common multilingual semantic taxonomy for sets of words in six languages. In total, 451 non-experts (including 427 Mechanical Turk workers) and 15 expert participants manually annotated 250 words with semantic senses for Arabic, Chinese, English, Italian, Portuguese and Urdu lexicons. To avoid erroneous (spam) crowdsourced results, we used a novel task-specific two-phase filtering process in which users were asked to identify synonyms in the target language and to remove erroneous senses.

Similar resources

Experiments on Crowdsourcing Policy Assessment

Can Crowds serve as useful allies in policy design? How do non-expert Crowds perform relative to experts in the assessment of policy measures? Does the geographic location of non-expert Crowds, with relevance to the policy context, alter the performance of non-expert Crowds in the assessment of policy measures? In this work, we investigate these questions by undertaking experiments designed to ...

Developing Web-Based Semantic Expert Systems

Expert systems have provided solutions to different problems, from strategic planning of marketing to consulting in process reengineering. In general, the majority of studies published are based on advanced techniques of artificial intelligence, using specific languages or tools that require certain knowledge of reasoning processes to model information. With the advent of the Internet and its e...

RSDnet: a Web-based Collaborative Framework for Building Multilingual Semantic Networks

We present a system (RSDnet) that allows non-expert Web users to contribute towards building a multilingual lexical resource. Our study focuses on the Romanian-English language pair, and the target resource is a Romanian WordNet strongly connected to the English WordNet. We use a bilingual dictionary, a monolingual definition dictionary and documents on the Web to build synsets, attach them a g...

The ESSOT System Goes Wild: an Easy Way For Translating Ontologies

To enable knowledge access across languages, ontologies that are often represented only in English need to be translated into different languages. Since manual multilingual enhancement of domain-specific ontologies is very time consuming and expensive, smart solutions are required to facilitate the translation task for the language and domain experts. For this reason, we present ESSOT, an Exper...

Lexicon+TX: rapid construction of a multilingual lexicon with under-resourced languages

Most efforts at automatically creating multilingual lexicons require input lexical resources with rich content (e.g. semantic networks, domain codes, semantic categories) or large corpora. Such material is often unavailable and difficult to construct for under-resourced languages. In some cases, particularly for some ethnic languages, even unannotated corpora are still in the process of collect...

Publication date: 2017